ABSTRACT



An image conversion system for converting monoscopic
images for viewing in three dimensions including: an input
means adapted to receive the monoscopic images; a
preliminary analysis means to determine if there is any
continuity between a first image and a second image of the
monoscopic image sequence; a secondary analysis means
for receiving monoscopic images which have a continuity,
and analyzing the images to determine the speed and
direction of motion, and the depth, size and position of objects;
a first processing means for processing the monoscopic
images based on data received from the preliminary analysis
means or the secondary analysis means; a second processing
means capable of further processing images received from
the first processing means; a transmission means capable of
transferring the processed images to a stereoscopic display
system.

65 Claims, 8 Drawing Sheets 

IMAGE PROCESSING METHOD AND
APPARATUS 

This application is a continuation of International Appli- 
cation Serial No. PCT/AU98/00716 filed Sep. 2, 1998, the 
teachings of which are incorporated herein by reference, 
which in turn claims priority from Australian Serial No. PO 
8944 filed Sep. 2, 1997. 

FIELD OF INVENTION 

The present invention relates generally to stereoscopic 
image systems, and in particular to the synthesis of stereo- 
scopic image pairs from monoscopic images for stereo- 
scopic display. The present invention may also be directed 
towards a five module method for producing stereoscopic 
images, that digitises a monoscopic source, analyses it for 
motion, generates the stereoscopic image pairs, optimises 
the stereoscopic effect, transmits or stores them and then 
enables them to be displayed on a stereoscopic display 
device. 

BACKGROUND ART 

The advent of stereoscopic or three dimensional (3D) 
display systems which create a more realistic image for the 
viewer than conventional monoscopic or two dimensional 
(2D) display systems, requires that stereoscopic images be 
available to be seen on the 3D display systems. In this regard
there exist many monoscopic image sources, for example
existing 2D films or videos, which could be manipulated to
produce stereoscopic images for viewing on a stereoscopic
display device.

Preexisting methods to convert such monoscopic images
for stereoscopic viewing do not produce acceptable results.
Other attempts in film and video have used techniques to
duplicate the stereoscopic depth cue of "Motion Parallax".
These involved producing a delay for the images presented
to the trailing eye when lateral (left or right) motion is
present in the images. Other attempts have used 'Lateral
Shifting' of the images to the left and right eyes to provide
depth perception.

However, these two techniques are limited and generally 
only suit specific applications. For example, the Motion 
Parallax technique is only good for scenes with left or right 
motion and is of limited value for the stereoscopic enhance- 
ment of still scenes. The Lateral Shifting technique will only 
give an overall depth effect to a scene and not allow different 
objects at varying depths to be perceived at the depths where 
they occur. Even the combination of these two techniques
will only give a limited stereoscopic effect for most 2D films
or videos.

Some existing approaches demonstrate limitations of 
these techniques. When an image has vertical motion and 
some lateral motion and a delay is provided to the image 
presented to the trailing eye then the result is often a large 
vertical disparity between the left and right views such that 
the images are uncomfortable to view. Scenes with contra
motion, such as objects moving left and right in the same
scene, are also uncomfortable to view. In certain embodiments
of these methods, when objects of varying depths
are present in an image there is a distinct 'cardboard cut-out'
appearance of the objects, with distinct depth modules rather
than a smooth transition of objects from foreground to
background.

In all these approaches no successful attempt has been 
made to develop a system or method to suit all image 
sequences or to resolve the problem of viewer discomfort or 
to optimise the stereoscopic effect for each viewer or display 
device. 

OBJECTS OF THE INVENTION

There is therefore a need for a system with improved 
methods of converting monoscopic images into stereoscopic 
image pairs and a system for providing improved
stereoscopic images to a viewer.

An object of the present invention is to provide such a 
system with improved methods. 

SUMMARY OF INVENTION 

In order to address the problems noted above the present 
invention provides in one aspect a method for converting 
monoscopic images for viewing in three dimensions includ- 
ing the steps of: 

receiving said monoscopic images; 
analysing said monoscopic images to determine charac- 
teristics of the images; 
processing said monoscopic images based on the deter- 
mined image characteristics; 
outputting the processed images to suitable storage and/or
stereoscopic display systems,
wherein analysing of said monoscopic images to deter- 
mine the motion includes the steps of: 
dividing each image into a plurality of blocks, wherein 
corresponding blocks on an adjacent image are offset 
horizontally and/or vertically; and 
comparing each said block with said corresponding 
blocks to find the minimum mean square error and 
thereby the motion of the block. 
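
The block comparison described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: images are plain lists of rows of luminance values, and the function name `block_motion`, its parameters and the exhaustive search order are all hypothetical. The search range `max_offset` plays the role of the predetermined number of pixels mentioned in the description of Module 2.

```python
def block_motion(prev, curr, top, left, size, max_offset):
    """Estimate the motion of one block by exhaustive block matching.

    prev, curr: equally sized images as lists of rows of luminance values
                (assumed layout, chosen for this example).
    The block of `prev` at (top, left) is compared with blocks of `curr`
    offset horizontally and/or vertically by up to `max_offset` pixels.
    Returns (dx, dy, min_mse): the offset with the minimum mean square error.
    """
    height, width = len(curr), len(curr[0])
    block = [row[left:left + size] for row in prev[top:top + size]]
    best = None
    for dy in range(-max_offset, max_offset + 1):
        for dx in range(-max_offset, max_offset + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions that fall outside the image.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            candidate = [row[x:x + size] for row in curr[y:y + size]]
            squared = sum(
                (p - q) ** 2
                for row_b, row_c in zip(block, candidate)
                for p, q in zip(row_b, row_c)
            )
            error = squared / (size * size)
            if best is None or error < best[2]:
                best = (dx, dy, error)
    return best


def make_image(patch_left):
    """Build an 8x8 test image with a 2x2 patch at row 3, column patch_left."""
    image = [[0] * 8 for _ in range(8)]
    image[3][patch_left], image[3][patch_left + 1] = 10, 20
    image[4][patch_left], image[4][patch_left + 1] = 30, 40
    return image


previous_image = make_image(2)
current_image = make_image(4)  # the patch has moved two pixels to the right
vector = block_motion(previous_image, current_image, top=3, left=2, size=2, max_offset=2)
```

In this toy case the patch moves two pixels to the right, so the best offset is two pixels horizontally with zero error. A practical system would repeat the search for every block of the image.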

In another aspect the present invention provides an image
conversion system for converting monoscopic
images for viewing in three dimensions including:

an input means adapted to receive monoscopic images; 
a preliminary analysis means to determine if there is any 
continuity between a first image and a second image of 
the monoscopic image sequence; 
a secondary analysis means for receiving monoscopic 
images which have a continuity, and analysing the 
images to determine at least one of the speed and 
direction of motion, or the depth, size and position of 
objects, wherein analysing of said monoscopic images 
to determine the motion includes the steps of: dividing 
each image into a plurality of blocks, wherein corre- 
sponding blocks on an adjacent image are offset hori- 
zontally and/or vertically, and comparing each said 
block with said corresponding blocks to find the mini- 
mum mean square error and thereby the motion of the 
block; 

a first processing means for processing the monoscopic 
images based on data received from the preliminary 
analysis means and/or the secondary analysis means. 
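
As an illustrative sketch only, the preliminary continuity check could be approximated by comparing luminance between two images. The detailed description mentions a median of luminance differences (on a 0-255 scale) typically above 30 as an indication of a scene change. The function name `has_continuity` and the list-of-rows image layout are assumptions made for this example.

```python
from statistics import median


def has_continuity(prev, curr, threshold=30):
    """Return True when two equally sized luminance images appear continuous.

    A median absolute luminance difference above `threshold` is treated
    as a scene change, following the value quoted in the description.
    """
    differences = [
        abs(p - c)
        for row_p, row_c in zip(prev, curr)
        for p, c in zip(row_p, row_c)
    ]
    return median(differences) <= threshold


dark_scene = [[10, 12], [11, 13]]
similar_scene = [[12, 15], [10, 18]]
bright_scene = [[200, 210], [190, 220]]
```

Here `has_continuity(dark_scene, similar_scene)` holds, while comparing `dark_scene` with `bright_scene` indicates a scene change. The threshold may vary with the application, as the description notes.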

Ideally, the input means also includes a means to capture 
and digitise the monoscopic images. 

Preferably the image analysis means is capable of deter- 
mining the speed and direction of motion, the depth, size and 
position of objects and background within an image. 

In a further aspect the present invention provides a 
method of optimising the stereoscopic image to further 
improve the stereoscopic effect and this process is generally 
applied prior to transmission, storage and display. 

In yet a further aspect the present invention provides a 
method of improving stereoscopic image pairs by adding a 
viewer reference point to the image. 

In still yet a further aspect the present invention provides 
a method of analysing monoscopic images for conversion to 
stereoscopic image pairs including the steps of: scaling each 
image into a plurality of regions; comparing each region of
a first image with corresponding and adjacent regions of a
second image to determine the nature of movement between
said first image and said second image.

Preferably a motion vector is defined for each image
based on a comparison of the nature of motion detected with
predefined motion categories ranging from no motion to a
complete scene change.

In yet a further aspect the present invention provides a
system for converting monoscopic images for viewing in
three dimensions including:

a first module adapted to receive a monoscopic image;
a second module adapted to receive the monoscopic
image and analyse the monoscopic image to create
image data, wherein analysing of said monoscopic
image to determine the motion includes the steps of:
dividing each image into a plurality of blocks, wherein
corresponding blocks on an adjacent image are offset
horizontally and/or vertically, and comparing each said
block with said corresponding blocks to find the minimum
mean square error and thereby the motion of the
block;
a third module adapted to create stereoscopic image pairs
from the monoscopic image using at least one predetermined
technique selected as a function of the image
data;
a fourth module adapted to transfer the stereoscopic
image pairs to a stereoscopic display means;
a fifth module consisting of a stereoscopic display means.

Preferably the first module is further adapted to convert
any analogue images into a digital image. Also, the second
module is preferably adapted to detect any objects in a scene
and make a determination as to the speed and direction of
any such motion. Conveniently, the image may be compressed
prior to any such analysis.

Preferably the third module further includes an optimisation
stage to further enhance the stereoscopic image pairs
prior to transmitting the stereoscopic image pairs to the
stereoscopic display means. Conveniently, the fourth module
may also include a storage means for storing the stereoscopic
image pairs for display on the stereoscopic display
means at a later time.

ADVANTAGES

It will be appreciated that the process of the present
invention can be suspended at any stage and stored for
continuation at a later time or transmitted for continuation at
another location if required.

The present invention provides a conversion technology
with a number of unique advantages including:

1) Realtime or Non-realtime Conversion

The conversion of monoscopic images to stereoscopic
image pairs can be performed in realtime or non-realtime.
Operator intervention may be applied to manually manipulate
the images. An example of this is in the conversion of
films or videos where every sequence may be tested and
optimised for its stereoscopic effect by an operator.

2) Techniques Include Stereoscopic Enhancement

The present invention utilises a plurality of techniques to
further enhance the basic techniques of motion parallax and
lateral shifting (forced parallax) to generate stereoscopic
image pairs. These techniques include but are not limited to
the use of object analysis, tagging, tracking and morphing,
parallax zones, reference points, movement synthesis and
parallax modulation techniques.

3) Detection and Correction of Reverse 3D

Reverse 3D is ideally detected as part of the 3D Generation
process by analysing the motion characteristics of an
image; techniques may then be employed to minimise
Reverse 3D so as to minimise viewer discomfort.

4) Usage in Various Applications - Includes
Transmission and Storage

The present invention discloses a technique applicable to
a broad range of applications and describes a complete
process for applying the stereoscopic conversion process to
monoscopic applications. The present invention
describes on the one hand techniques for 3D
Generation where both the image processing equipment
and stereoscopic display equipment are located substantially
at the same location. While on the other hand
techniques are defined for generation of the stereoscopic
image pairs at one location and their transmission, storage
and subsequent display at a remote location.

5) Can be Used With Any Stereoscopic Display
Device

The present invention accommodates any stereoscopic
display device and ideally has built in adjustment facilities.
The 3D Generation process can also take into account the
type of display device in order to optimise the stereoscopic
effect.

BRIEF DESCRIPTION OF FIGURES

The invention will be more fully understood from the
following description of a preferred embodiment of the
conversion method and integrated system and as illustrated
in the accompanying figures. It is, however, to be appreciated
that the present invention is not limited to the described
embodiment.

FIG. 1 shows the breakdown into modules of a complete
system utilising the present invention.

FIG. 2 shows a possible use of multiple processors with
a complete system utilising the present invention.

FIG. 3 shows a flow diagram of Module 1 (Video
Digitising) and the first part of Module 2 (Image Analysis).

FIG. 4 shows the second part of a flow diagram of Module 2.

FIG. 5 shows the third part of a flow diagram of Module 2.

FIG. 6 shows the fourth part of a flow diagram of Module 2.

FIG. 7 shows a flow diagram of the first part of Module
3 (3D Generation).

FIG. 8 shows the second part of a flow diagram of Module
3 and Module 4 (3D Media — Transmission & Storage) and
Module 5 (3D Display).

DETAILED DESCRIPTION

The present invention aims to provide a viewer with a
stereoscopic image that uses the full visual perception
capabilities of an individual. Therefore it is necessary to
provide the depth cues the brain requires to interpret such
images.

INTRODUCTION

Humans see by a complex combination of physiological
and psychological processes involving the eyes and the
brain. Visual perception involves the use of short and long
term memory to be able to interpret visual information with
known and experienced reality as defined by our senses. For
instance, according to the Cartesian laws on space and
perspective the further an object moves away from the
viewer the smaller it gets. In other words, the brain expects
that if an object is large it is close to the viewer and if it is
small it is some distance off. This is a learned process based
on knowing the size of the object in the first place. Other
monoscopic or minor depth cues that can be represented in
visual information are for example shadows, defocussing,
texture, light, atmosphere.

These depth cues are used to great advantage in the
production of 'Perspective 3D' video games and computer
graphics. However, the problem with these techniques in
achieving a stereoscopic effect is that the perceived depth
cannot be quantified: it is an illusion of displaying 2D
objects in a 2D environment. Such displays do not look real
as they do not show a stereoscopic image because the views
to both eyes are identical.

DEPTH CUES

Stereoscopic images are an attempt to recreate real world
visuals, and require much more visual information than
'Perspective 3D' images so that depth can be quantified. The
stereoscopic or major depth cues provide this additional data
so that a person's visual perception can be stimulated in
three dimensions. These major depth cues are described as
follows:

1) Retinal Disparity — refers to the fact that both eyes see a
slightly different view. This can easily be demonstrated by
holding an object in front of a person's face and focussing
on the background. Once the eyes have focused on the
background it will appear as though there are actually two
objects in front of the face. Disparity is the horizontal
distance between the corresponding left and right image
points of superimposed retinal images. While Parallax is
the actual spatial displacement between the viewed
images.

2) Motion Parallax — Those objects that are closer to the
viewer will appear to move faster even if they are travelling at
the same speed as more distant objects. Therefore relative 
motion is a minor depth cue. But the major stereoscopic 
depth cue of lateral motion is the creation of motion paral- 
lax. With motion in an image moving from right to left, the 
right eye is the leading eye while the left eye becomes the 
trailing eye with its image being delayed. This delay is a 
normal function of our visual perception mechanism. For 
left to right motion the right eye becomes the trailing eye. 
The effect of this delay is to create retinal disparity (two 
different views to the eyes), which is perceived as binocular 
parallax thus providing the stereoscopic cue known as 
Motion Parallax. 
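
The leading and trailing eye rule above can be sketched in Python. This is a simplified illustration, not the implementation of the invention: the names `trailing_eye` and `stereo_pair`, the sign convention (positive horizontal motion means left to right) and the use of a fixed delay counted in stored fields are assumptions made for the example.

```python
def trailing_eye(dx):
    """Return which eye receives the delayed image for horizontal motion dx.

    Assumed convention: dx > 0 is left-to-right motion, dx < 0 is
    right-to-left motion. For right-to-left motion the left eye trails;
    for left-to-right motion the right eye trails.
    """
    if dx > 0:
        return "right"
    if dx < 0:
        return "left"
    return None  # no lateral motion, so no trailing eye


def stereo_pair(fields, dx, delay):
    """Build a (left, right) pair from a history of fields.

    fields: list of past fields, oldest first, with the current field last.
    delay:  number of fields by which the trailing eye lags (assumed fixed).
    """
    current = fields[-1]
    eye = trailing_eye(dx)
    if eye is None or delay <= 0:
        return current, current
    delayed = fields[max(0, len(fields) - 1 - delay)]
    if eye == "left":
        return delayed, current
    return current, delayed


history = ["f0", "f1", "f2", "f3"]
right_to_left = stereo_pair(history, dx=-3, delay=2)
left_to_right = stereo_pair(history, dx=3, delay=2)
```

With right to left motion the left eye is shown the older field, and with left to right motion the right eye is.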

3) Accommodation — The eye brings an object into sharp 
focus by either compressing the eye lens (more convex 
shape for close object) or expanding the eye lens (less 
convex shape for far object) through neuromotor activity. 
The amount and type of neuromotor activity is a stereo- 
scopic cue for depth in an image. 

4) Convergence — Is the response of the eye's neuromotor 
system that brings images of an object into alignment with 
the central visual area of the eyes such that only one 
object is seen. For example, when a finger held at arm's
length is viewed by both eyes and slowly brought towards
the face, the eyes turn inwards (converge) indicating that
the finger is getting closer. That is, the depth to the finger
is decreasing.

The eyes' convergence response is physiologically linked
to the accommodation mechanism in normal vision. In
stereoscopic viewing, when viewers are not accommodated
to the 'Fixation Plane' (that to which the eyes are
converged), they may experience discomfort. The 'Plane of
Fixation' is normally the screen plane.

OVERVIEW — 5 MODULE APPROACH 

The present invention describes a system that is capable
of taking any monoscopic input and converting it to an
improved stereoscopic output. For ease of description this
complete system can be broken down into a number of
independent modules or processes, namely:

MODULE 1 — Monoscopic Image Input (typically video
input)
MODULE 2 — Image Analysis
MODULE 3 — 3D Generation
MODULE 4 — 3D Media (Transmission or Storage)
MODULE 5 — 3D Display

FIG. 1 shows this top down approach to the stereoscopic
conversion process, where video or some other monoscopic
image source is input, images are analysed, stereoscopic
image pairs are generated, transmitted and/or stored and then
displayed on a stereoscopic display. Each Module describes
an independent process of the complete system from
monoscopic image input to stereoscopic display. However, it will
be appreciated that the various modules may be operated
independently.
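
The five module breakdown can be pictured as a chain of independent stages. The Python sketch below is only a structural illustration: every function name is hypothetical, and each stage is a placeholder (for example, the 3D generation stage simply duplicates the image for both eyes) rather than the processing described in this specification.

```python
def digitise(source):
    """Module 1 placeholder: copy the incoming images into a list of frames."""
    return [[list(row) for row in image] for image in source]


def analyse(frames):
    """Module 2 placeholder: produce one analysis record per frame."""
    return [{"index": i, "motion": (0, 0)} for i, _ in enumerate(frames)]


def generate_3d(frames, analysis):
    """Module 3 placeholder: return (left, right) pairs, here identical views."""
    return [(frame, frame) for frame, _ in zip(frames, analysis)]


def store_or_transmit(pairs):
    """Module 4 placeholder: keep the pairs in memory for later display."""
    return list(pairs)


def display(pairs):
    """Module 5 placeholder: report how many stereoscopic pairs were shown."""
    return len(pairs)


def run_pipeline(source):
    """Run the stages in order; each stage could also be run on its own."""
    frames = digitise(source)
    analysis = analyse(frames)
    pairs = generate_3d(frames, analysis)
    media = store_or_transmit(pairs)
    return display(media)


shown = run_pipeline([[[0, 1]], [[2, 3]]])
```

Because each stage only consumes the output of the previous one, the result of any stage could be stored or sent elsewhere before the next stage runs.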

APPLICATIONS 

Generally, all five modules are used, from monoscopic
image input to display for a particular application. For
example, this system may be used in theatres or cinemas. In
such an application the 2D video input can take the form of
analogue or digital video sources. These sources
would then be analysed to determine speed and direction of
any motion. The processes would then work in either
real-time or non real-time in order to create the 3D images.
This can be further optimised through the use of borders,
parallax modification, reverse 3D analysis, shading, and/or
texturing. The 3D images may then be stored or transmitted
to a 3D display, including shutterglasses, polarising glasses
or an autostereoscopic display.

This system may also be adapted for use with cable or
pay-TV systems. In this application the 2D video input could
be video from a VTR, a laser disc, or some other digital
source. Again the 3D Generation and/or optimisation can
proceed in either real time or non real time. The 3D media
module would conveniently take the form of transmission
via cable or satellite to enable 3D display on TV, video
projector, or an auto stereoscopic display.

The system may also be used with video arcade games, in
multimedia, or with terrestrial or network TV. Depending on
the application the 2D video input module may obtain
source monoscopic images from a games processor, video
from a laser disc, video from VTR, video from a network, or
some other digital storage device or digital source or telecine
process. The 3D Generation can take place in real time or
non real time, and be generated by computer at a central
conversion site, in a user's computer, on a central processor,
or some other image processor. The stereoscopic images can
then be stored on video or other digital storage device, prior
to distribution to cinemas or transmission by a local network.
These stereoscopic images may also be transmitted to
video projectors via a local transmission, or alternatively via
VHF/UHF facilities or satellite.

The 3D display is dependent on the application required,
and can take the form of an auto stereoscopic display device,
a video projector with polarising glasses, a local monitor
with shutter-glasses, or a set-top box with suitable viewing
glasses.

Single & Multiple Processors

The complete system can be operated on a single processor
with all five modules being processed together or individually
in realtime or non-realtime (Modules 2, 3 and 4).
Modules 2 and 3 can be further segmented to suit a
multi-tasking or multiprocessor environment, as can be seen in
FIG. 2 for example.

The use of multiple processors can also be configured to
the application on hand. For example, modules 1 and 2 could
be handled by a first processor, and modules 3 to 5 by a
second processor. If desired, the first processor of this
arrangement could be used as a look-ahead processor, and
the second processor could generate the stereoscopic images
after a delay. Alternatively, a first processor could be used to
receive realtime video, digitise the video and transfer the
digitised video to a suitable digital storage device. A second
processor, either on site or remotely, could then analyse the
digitised image and perform the necessary tasks to display a
stereoscopic image on a suitable display device.

Look-ahead processing techniques may be employed to
predict trends in sequences of film or video so that the image
processing modes may be more efficiently selected to optimise
the overall stereoscopic effect.

The present invention is primarily concerned with the
analysis of monoscopic images and conversion of the
monoscopic images into stereoscopic image pairs together with
the optimisation of the stereoscopic effect. In this regard the
present invention is applicable to a broad range of
monoscopic inputs, transmission means and viewing means.
However, for completeness all five defined modules will be
described herein.

Module 1 — Image or Video Input

Module 1 requires that a monoscopic image source or
video input is provided. This source may be provided as
either a digital image source or an analogue image source
which may then be digitised. These image sources may
include:

1) Analogue Source
a) Tape based — VCR/VTR or Film.
b) Disk based — Laser Disk.
c) Video Camera or other realtime image capture device.
d) Computer generated images or graphics.

2) Digital Source
a) Tape based — Typical examples are DAT, AMPEX's
DCT, SONY's Digital Betacam, Panasonic's digital
video formats or the new Digital Video Cassette (DVC)
format using 6.5 mm tape.
b) Disk based storage — Magneto Optical (MO) hard disk
(HD), compact disk (CD), Laser Disk, CD-ROM, DAT,
Digital Video Cassette (DVC) or Digital Video Disk
(DVD) based data storage devices — uses JPEG, MPEG
or other digital formats.
c) Video Camera or other realtime image capture device.
d) Computer generated images or graphics.

What is important for the conversion process of the
present invention is that a monoscopic image source be
provided. It is noted that a stereoscopic image source may be
provided which would generally obviate the need for
modules 1 to 3, however, any such stereoscopic image may be
passed through an optimisation stage prior to display.

Module 2 — Image Analysis

Referring now to FIGS. 3 to 8 which show flow diagrams
demonstrating a preferred arrangement of the present
invention.

Following reception of 2D images, digitised video or
digital image data is processed on a field by field or image
by image basis in realtime or non-realtime by hardware,
software or by a combination of both. Firstly, the image
analysis process occurs including the steps of:

1) Image compression.
2) Motion detection.
3) Object detection.
4) Motion analysis.

1) Image Compression

Compression of the image is not essential, however, for
many processes and applications, compression is a practical
option, particularly where the processor is not powerful
enough to process a full resolution image in the time
required.

Preferably the images are scaled to smaller dimensions.
The scaling factor is dependent on the digital video resolution
used for each image, and is usually defined by the type
of image [illegible] process.

2) Motion Detection

In a preferred embodiment each image may be analysed
in blocks of pixels. A motion vector is calculated for each
block by comparing blocks from one image with
corresponding blocks from an adjacent image that are offset
horizontally and/or vertically by up to a predetermined
number of pixels, for example ±9, and recording the position
that gives the minimum Mean Squared Error.

For each block, the vector and minimum and maximum
Mean Square Error are recorded for later processing.

To save on processing time, vectors need not be calculated
if there is no detail in the block, for example, when the block
is a homogeneous colour.

Other methods for calculating the motion can be utilised,
for example image subtraction. The present embodiment
uses the Mean Squared Error method.

3) Object Detection

An Object is defined as a group of pixels or image
elements that identify a part of an image that has common
features. Those characteristics may relate to regions of
similar luminance value (similar brightness), chrominance
value (similar colour), motion vector (similar speed and
direction of motion) or similar picture detail (similar pattern
or edge).

For example a car driving past a house. The car is a region
of pixels or pixel blocks that is moving at a different rate to
the background. If the car stopped in front of the house then
the car would be difficult to detect, and other methods may
be used.

A connectivity algorithm may be used to combine the
motion vectors into regions of similar motion vectors. An
Object may be comprised of one or more of such regions.
Other image processing algorithms, such as edge detection
etc, may be used in the detection of Objects.

Once Objects are identified in an image they are preferably
tagged or given an identification number. These Objects
and their relevant details (for example position, size, motion
vector, type, depth) are then stored in a database so that
further processing may occur. If an Object is followed over
a sequence of images then this is known as Object Tracking.
By tracking Objects and analysing their characteristics they
can be identified as being foreground or background Objects
and therefore enhanced to emphasise their depth position in
an image.

4) Motion Analysis

Once Objects have been detected, the Objects can be
analysed to determine the overall speed and direction of
motion in the image. In the preferred embodiment, this stage
determines the type of motion in the image, and also
provides an overall vector.

By using the Object Detection information and comparing
the data to several image motion models a primary
determination can be made as to the best method to convert
monoscopic images to stereoscopic image pairs.

The image motion models as used in the preferred
embodiment of the present invention are:

a) Scene Change.
b) Simple Pan.
c) Complex Pan.
d) Moving Object over stationary background.
e) Foreground Object over moving background.
f) No Motion.

Other motion models may be used as required.

a) Scene Change

A scene change as the name suggests is when one image
has little or no commonality to a previous image or scene.
It may be detected as a very large absolute difference in
luminance between the two images, or a large difference in
the colours of the two images.

In a preferred arrangement a scene change may be
determined when the median of the differences of luminance
values (0-255) between previous and current images is
typically above 30. This value may vary with application but
trial and error has determined that this value is appropriate
for determining most scene changes.

A secondary test to determine a scene change can be when
there are too many regions of motion vectors, which appears
like random noise on the image and is likely due to a scene
change. This may occur if there is a very large amount of
motion in the image.

A third technique to detect a scene change is to analyse the
top few lines of each image to detect a scene change. The top
of each image changes the least.

Alternatively, when the majority of motion vector blocks
have large error values the difference between the two
images is too great and will therefore be considered as a
scene change.

Scene Change and Field Delay

In the preferred embodiment when there is lateral motion
detected in a scene the image to the trailing eye is delayed
by an amount of time that is inversely proportional to the
speed of the motion. For an image moving right to left the
trailing eye is the left eye and for an image moving left to
right the trailing eye is the right eye.

The image sequence delay (or Field Delay) to the trailing
eye, may be created by temporally delaying the sequence of

[...]

eye can be maintained. This helps in smoothing the
stereoscopic effect by enabling the image processor to predict any
motion trends and to react accordingly by modifying the
delay so that there are no sudden changes.

If a scene change is detected the Field Delay for the
preferred embodiment of the present invention is set to zero
to prevent the image breaking apart and the Field Delay
history is also reset. Field Delay history is preferably reset
on each scene change.

b) Simple Pan

A simple pan describes a lateral motion trend over a series
of images whereby the majority of analysed motion is in one
direction. This will preferably also cover the situation where
the majority of the scene has a consistent motion, and no
stationary objects are detected in the foreground.

A simple pan can be detected as the major Object having
a non zero motion vector.

The result of a simple pan is that a positive motion vector
is generated if the scene is moving to the right (or panning
left). In this case, the image to the right eye will be delayed.
Similarly, a negative motion vector is generated if the scene
is moving to the left (or panning right). In this case, the
image to the left eye will be delayed.

c) Complex Pan

A complex pan differs from a simple pan in that there is
significant vertical motion in the image. Therefore, in the
preferred embodiment, to minimise vertical disparity
between the stereoscopic image pair sequences, Field Delay
is not applied and only Object Processing is used to create
a stereoscopic effect. Field Delay history is stored to
maintain continuity with new lateral motion.

d) Moving Object over Stationary Background

A moving object over a stationary background is simply
the situation whereby the majority of a scene has no motion,
and one or more moving Objects of medium size are in
scene. This situation also results in a positive motion vector
if the majority of Objects are moving to the right, and a
negative motion vector if the majority of Objects are moving
to the left. A positive motion vector produces a delay to the
right eye, and a negative motion vector produces a delay to
the left eye.

In the case where the motion vectors of the Objects in the
scene are not consistent, for example, objects moving to the
left and right in the same scene, then Contra Motion exists
and Reverse 3D correction techniques may be applied.

e) Foreground Object over Moving Background

A Foreground Object over a moving background refers to
the situation where a majority of the scene has motion, and
an Object having a different motion is in the scene, for
example a camera following a person walking. A Background
Object is detected as a major Object of non-zero
motion vector (that is, a panning background) behind an
Object of medium size with zero or opposite motion vector
to the main Object, or a major Object of zero vector in front
of minor Objects of non zero vector that are spread over the
entire field (that is, a large stationary object filling most of
the field, but a pan is still visible behind it).

A decision should be made as to whether the foreground
Object should be given priority in the generation of Motion
Parallax, or whether the background should be given prior-
video fields to the trailing eye by storing them in digital form ity. If the background contains a large variation in depth (for 
in memory. The current video field is shown to the leading example, trees), then motion vectors are assigned as if a 
eye and the delayed image to the trailing eye is selected from Simple pan was occurring. If the background contains little 
the stored video fields depending on the speed of the lateral variation in depth (for example, a wall) then a motion vector 
motion. 65 is assigned that is antiparallel or negative. 

Over a number of fields displayed, a history as to the When the background contains a large variation in depth, 

change in motion and change in Field Delays to the trailing and a motion vector is assigned to the scene as per Simple 






US 6,496,598 B1


Pan methods, then the foreground object will be in Reverse
3D, and suitable correction methods should be applied.

f) No Motion

If no motion is detected such that the motion vectors are
entirely zero, or alternatively the largest moving Object is
considered too small, then the Field Delay will be set to zero.
This situation can occur where only random or noise motion
vectors are determined, or where no motion information is
available, for example during a pan across a blue sky.

Module 3 — 3D Generation

Once the images are analysed they can then be processed
to create the stereoscopic image pairs.

When viewing a real world scene both eyes see a slightly
different image. This is called retinal disparity. This in turn
produces stereopsis or depth perception. In other words we
see stereoscopically by having each eye see a slightly
different image of the same scene.

Parallax on the other hand is defined as the amount of
horizontal or lateral shift between the images which is
perceived by the viewer as retinal disparity. When a
stereoscopic image pair is created, a three-dimensional scene is
observed from two horizontally-shifted viewpoints.

The present invention utilises a number of image and
object processing techniques to generate stereoscopic image
pairs from monoscopic images.

These techniques include:

1) Motion Parallax.

2) Forced Parallax (Lateral Shifting).

3) Parallax Zones.

4) Image Rotation about the Y-Axis.

5) Object Processing.

1) Motion Parallax

When a scene is moving from right to left, the right eye
will observe the scene first while the left eye will receive a
delayed image and vice versa for a scene moving in the
opposite direction. The faster the motion the less delay
between the images to both eyes. This is known as motion
parallax and is a major depth cue. Therefore, if there is
lateral motion in a scene, by creating a delay between the
images to the eyes a stereoscopic effect will be perceived.

a) Field Delay Calculation

Once the nature of the motion in an image has been
analysed and an overall motion vector determined, the
required Field Delay can then be calculated. Preferably, the
calculated Field Delay is averaged with previous delays to
filter out 'noisy' values and also to prevent the Field Delay
changing too quickly.

As stated above, the faster the motion the less delay
between the image to each eye. Accordingly, smaller values
of Field Delay are used in scenes with large motion vectors,
whereas larger delays are used in scenes with little lateral
motion. That is, an inverse relationship exists in the preferred
embodiment between the delay and amount of
motion.

When a scene change is determined, the history of Field
Delays should be reset to zero, as if no motion had occurred
previously. At the first detection of motion when a non zero
Field Delay is calculated whilst the history of Field Delays
is still zero, the entire history of Field Delay is set to the
calculated Field Delay. This enables the system to immediately
display the correct Field Delay when motion is
detected.

b) Field Delay Implementation

Motion Parallax can be generated in hardware and software
by storing digitised images in memory. Preferably, the
digitised images could be stored in a buffer and a single
input pointer used with two output pointers, one for the left
eye image and one for the right eye image. The leading eye's
image memory pointer is maintained at or near the current
input image memory pointer while the delayed eye's image
memory pointer is set further down the buffer to produce a
delayed output. Many images may be stored; up to 8-10
video fields is typical in video applications. The delay is
dependent on the speed of the motion analysed in the image.
Maximum field delay is when there is minimum motion.

2) Forced Parallax (Lateral Shifting)

Forced parallax can be created by introducing a lateral
shift between:

i) An exact copy of an image and itself

ii) The two fields of a video frame

iii) Two frames of a film sequence

iv) A transformed copy of an image and its original

A Negative lateral shift is produced by displacing the left
image to the right and the right image to the left by the same
amount (establishes a depth of field commencing from the
screen plane and proceeding in front of it) and a Positive
lateral shift by displacing the left image to the left and the
right image to the right by the same amount (establishes a
depth of field commencing from the screen plane and
receding behind it).

Forced Parallax may be reduced to enhance the stereoscopic
effect for a stationary object in front of a pan, where
the object is 'placed' closer to the screen plane and the
background is 'pushed back' from the defined object plane.

3) Parallax Zones

Because most scenes are viewed with the background at
the top and the foreground at the bottom it is possible to
accentuate a scene's depth by 'Veeing' the Forced Parallax.
This is done by laterally shifting the top of the image more
than the bottom of an image thus accentuating the front to
back depth observed in a scene.

Another technique is to use a combination of Motion
Parallax and Forced Parallax on different parts of the image.
For example, by splitting the image vertically in half and
applying different parallax shifts to each side, a scene such
as looking forwards from a moving train down a railway
track has the correct stereoscopic effect. Otherwise one side
would always appear in Reverse 3D.

4) Rotation about the Y-Axis

When an object is moving towards the viewer in a real
world scene, the object is rotated slightly in the view for
each eye. The rotation effect is more pronounced as the
object moves closer. Translating this rotation into the
stereoscopic image pairs defines the effect as follows:

i) Moving towards the viewer — The left image is rotated
vertically about its central axis in an anti-clockwise
direction and the right image in a clockwise direction.

ii) Moving away from the viewer — The left image is
rotated vertically about its central axis in a clockwise
direction and the right image in an anti-clockwise
direction.

Therefore, by image rotation, the perspective of objects in
the image is changed slightly so that depth is perceived.
When this technique is combined with Forced Parallax for
certain scenes the combined effect provides very powerful
stereoscopic depth cues.

5) Object Processing

Object processing is performed to further enhance the
stereoscopic effect, particularly in still images, by separating
the Objects and background so that these items can be
processed independently. It is most effective when the
objects are large in size, few in number and occupy distinct
depth levels throughout the depth of field.

A database for Object Tagging and Object Tracking can be
used to establish trends so that an Object can be digitally 'cut
out' from its background and appropriate measures taken to
enhance the stereoscopic effect. Once processing has taken
place the Object is 'Pasted' back in the same position on to
the background again. This can be termed the 'Cut and
Paste' technique and is useful in the conversion process.

By integrating the processes of Object Tagging, Tracking,
Cutting and Pasting a powerful tool is available for enabling
Object Processing and Background Processing.

Another Object Processing technique is Object Layering
which defines an independent depth module for each moving
Object. The Object can then be placed anywhere on an
image because the background fill detail has been defined
when the Object was not in that position. This is not
generally possible with a still Object unless the background
fill-in is interpolated.

A most important issue in stereoscopic conversion is the
correction of Reverse 3D and Accommodation/Convergence
unbalances that cause viewer discomfort. Object Processing
in the preferred embodiment allows corrections to this
problem too.

a) Mesh Distortion and Morphing

This Object processing technique allows an Object to be
cut and pasted onto a distorted mesh to enhance depth
perception. By distorting an Object in the left eye image to
the right and by distorting the same object in the right eye
image to the left, thus creating Object Parallax, the Object
can be made to appear much closer to a viewer when using
a stereoscopic display device.

b) Object Barrelling

This technique is a specific form of Mesh Distortion and
refers to a technique of cutting an Object from the image and
wrapping onto a vertically positioned half barrel. This makes
the Object appear to have depth by making the centre portion
of the Object appear closer than the Object edges.

c) Object Edge Enhancement

By emphasising the edges of an Object there is greater
differentiation between the background or other Objects in
an image. The stereoscopic effect is enhanced in many
applications by this technique.

d) Object Brightness Enhancement

In any image the eye is always drawn to the largest and
brightest objects. By modifying an Object's luminance the
Object can be emphasised more over the background,
enhancing the stereoscopic effect.

e) Object rotation about Y-axis

Object rotation about the Y-axis refers to a similar process
to that of image rotation about the Y-axis, except that this
time the rotation occurs to the Object only. If the Object in
the stereoscopic image pair is 'Cut' from its background and
rotated slightly the change in perspective generated by the
rotation is perceived as depth.

3D Optimisation

1) Reference Points or Borders

When using a normal TV or video monitor to display
stereoscopic images the eye continually observes the edge of
the monitor or screen and this is perceived as a point of
reference or fixation point for all depth perception. That is,
all objects are perceived at a depth behind or in front of this
reference point.

If the edge of the monitor is not easily seen because of
poor ambient lighting or due to its dark colour then this
reference point may be lost and the eyes may continually
search for a fixation point in the 3D domain. Under prolonged
stereoscopic viewing this can cause eye fatigue and
decreased depth perception. A front or rear projection screen
display system may also suffer from the same problems.

The present invention therefore preferably also defines a
common border or reference point within a viewed image.
Ideally the reference plane is set at the screen level and all
depth is perceived behind this level. This has the advantage
of enhancing the stereoscopic effect in many scenes.

This reference point can be a simple video border or
reference graphic and, for example, may be of the following
types:

i) A simple coloured video border around the perimeter of
the image.

ii) A complex coloured video border consisting of two or
more concentric borders that may have opaque or
transparent sections between them. For example, a 2-3
cm wide mesh border or a wide outer border with two
thin inner borders.

iii) A partial border that may occupy any one edge, or any
two horizontal or vertical edges.

iv) A LOGO or other graphic located at some point within
the image.

v) A picture within a picture.

vi) A combination of any of the above.

What is essential in this embodiment is that the eyes of the
viewer be provided with a reference point by which the
depth of the objects in the image can be perceived.

If a border or graphic is added at the 3D Generation level
then it may be specified to provide a reference point at a
particular depth by creating left and right borders that are
laterally shifted from each other. This enables the reference
or fixation point to be shifted in space to a point somewhere
behind or in front of the screen level. Borders or graphics
defined with no parallax for the left and right eyes will be
perceived at the screen level. This is the preferred mode of
the present invention.

A border or reference graphic may be inserted at
the 3D Generation point or it may be defined externally and
genlocked onto the stereoscopic image output for display.
Such an image border or reference graphic may be black,
white or coloured, plain or patterned, opaque, translucent or
transparent to the image background, or it may be static or
dynamic. Whilst a static border is appropriate in most
instances, in some circumstances a moving or dynamic
border may be used for motion enhancement.

2) Parallax Adjustment — Depth Sensitivity Control

Stereoscopic images viewed through a stereoscopic display
device automatically define a depth range (called depth
acuity) which can be increased or decreased by modifying
the type and amount of parallax applied to the image or
objects. It has been found that different viewers have varying
stereoscopic viewing comfort levels based on the depth
range or amount of stereopsis defined by stereoscopic image
pairs. That is, while some viewers prefer a pronounced
stereoscopic effect with a greater depth range, others prefer
an image with minimal depth.

To adjust the level of depth sensitivity and viewing
comfort many techniques may be used, namely:

i) Varying the amount of Motion Parallax by varying the
Field Delay

ii) Varying the amount of Forced Parallax to an image

iii) Varying the amount of Parallax applied to objects

By reducing the maximum level of Parallax the depth
range can be reduced, improving the viewing comfort for
those with perception faculties having greater sensitivity to
stereoscopy.
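The three adjustments listed for depth sensitivity control can be sketched as a single viewer setting that scales each parallax source and caps the result. This is only an illustrative sketch: the function name `adjust_depth`, the `sensitivity` factor and the `max_parallax_px` cap are hypothetical and are not taken from the specification.

```python
def adjust_depth(field_delay, forced_parallax_px, object_parallax_px,
                 sensitivity=1.0, max_parallax_px=10.0):
    """Scale the three parallax sources by one viewer-chosen factor.

    All parameter names are illustrative assumptions. A sensitivity below
    1.0 reduces the depth range; max_parallax_px caps the lateral shifts.
    """
    if sensitivity < 0:
        raise ValueError("sensitivity must be non-negative")

    def clamp(value):
        # Keep the lateral shift inside the permitted range.
        return max(-max_parallax_px, min(max_parallax_px, value))

    return {
        # Motion Parallax is varied by varying the Field Delay (whole fields).
        "field_delay": round(field_delay * sensitivity),
        # Forced Parallax applied to the whole image.
        "forced_parallax_px": clamp(forced_parallax_px * sensitivity),
        # Parallax applied to individual objects.
        "object_parallax_px": clamp(object_parallax_px * sensitivity),
    }
```

A viewer who prefers minimal depth would select a small `sensitivity`; the cap corresponds to reducing the maximum level of Parallax.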




3) Parallax Smoothing 

Parallax Smoothing is the process of maintaining the total
amount of Parallax (Motion Parallax plus Forced Parallax)
as a continuous function. Changes in Field Delay for specific
motion types, that is, Simple Pan and Foreground Object
Motion, cause discontinuities in the amount of Motion
Parallax produced, which are seen as "jumps" in the
stereoscopic images by the viewer. Discontinuities only occur in
the image produced for the trailing eye, as the leading eye
is presented with an undelayed image. These discontinuities
can be compensated for by adjusting the Forced Parallax or
Object Parallax in an equal and opposite direction for the
trailing eye, thus maintaining a continuous total parallax.

The Forced Parallax or Object Parallax is then adjusted
smoothly back to its normal value, ready for the next change
in Field Delay. The adjustments made to Forced Parallax by
Parallax Smoothing are a function of Field Delay change,
motion type and motion vector. To implement Parallax
Smoothing, the Forced Parallax for the left and right eye
images should be independently set.
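The compensation described above can be sketched as follows. The sketch assumes, purely for illustration, that the Motion Parallax in pixels is approximately the Field Delay multiplied by the lateral speed in pixels per field; the specification only states that the adjustment is a function of Field Delay change, motion type and motion vector. The class name and the `recovery_px_per_field` rate are hypothetical.

```python
class ParallaxSmoother:
    """Keeps total parallax continuous across Field Delay changes (sketch)."""

    def __init__(self, nominal_forced_px, recovery_px_per_field=0.5):
        self.nominal = nominal_forced_px        # normal Forced Parallax value
        self.recovery = recovery_px_per_field   # how fast the offset decays
        self.offset = 0.0                       # temporary trailing-eye offset
        self.prev_delay = 0

    def step(self, field_delay, speed_px_per_field):
        """Return the trailing-eye Forced Parallax for the current field."""
        # Assumed model: jump in Motion Parallax = delay change * speed.
        jump = (field_delay - self.prev_delay) * speed_px_per_field
        self.prev_delay = field_delay
        # Move any existing offset smoothly back towards zero.
        if self.offset > 0:
            self.offset = max(0.0, self.offset - self.recovery)
        elif self.offset < 0:
            self.offset = min(0.0, self.offset + self.recovery)
        # Cancel the new jump with an equal and opposite adjustment.
        self.offset -= jump
        return self.nominal + self.offset
```

Under the assumed model, the sum of Motion Parallax and the returned Forced Parallax changes by at most `recovery_px_per_field` per field while the speed is constant, even when the Field Delay steps.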

4) Parallax Modulation 

The Forced Parallax technique of creating a stereoscopic
effect can also be used to moderate the amount of stereopsis
detected by the viewer. This is done by varying the Forced
Parallax setting between a minimum and maximum limit
over a short time such that the perceived depth of an object
or image varies over time. Ideally the Forced Parallax is
modulated between its minimum and maximum settings
every 0.5 to 1 second. This enables a viewer to accommodate
to their level of stereoscopic sensitivity.
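As a rough sketch of such modulation, the function below sweeps the Forced Parallax linearly from its minimum to its maximum and back once per period. The triangle waveform, the function name and the interpretation of "every 0.5 to 1 second" as one full cycle are assumptions made for illustration.

```python
def modulated_forced_parallax(t_seconds, min_px, max_px, period_s=1.0):
    """Forced Parallax at time t for a triangle-wave modulation (sketch).

    period_s is assumed to be the time for one min -> max -> min cycle.
    """
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    phase = (t_seconds % period_s) / period_s   # position in cycle, 0..1
    triangle = 1.0 - abs(2.0 * phase - 1.0)      # rises 0 -> 1, falls 1 -> 0
    return min_px + (max_px - min_px) * triangle
```

For 50 fields per second, evaluating the function at `t = n / 50` for each field number `n` gives the per-field setting.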

5) Movement Synthesis 

By creating pseudo movement, by randomly moving the
background in small undetectable increments, the perceived
depth of foreground objects is emphasised. Foreground
objects are 'Cut' from the background, the background is
altered pseudo-randomly by one of the techniques below and
then the foreground object is 'Pasted' back on to the
background ready for display. Any of the following techniques
may be used:

i) Luminance values varied on a pseudo-random basis

ii) Chrominance values varied on a pseudo-random basis

iii) Adding pseudo-random noise to the background to
create movement
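Technique i) can be sketched as below, assuming the image is a list of rows of 8-bit luminance values and that a foreground mask from an earlier object detection step is already available. The function name, the mask representation and the `amplitude` parameter are illustrative assumptions.

```python
import random


def synthesise_background_movement(luma, foreground_mask, amplitude=1, seed=0):
    """Jitter background luminance while leaving foreground pixels untouched.

    luma            : list of rows of luminance values (0-255)
    foreground_mask : same shape, True where a pixel belongs to a 'Cut' object
    amplitude       : largest change applied to a background pixel (assumed)
    """
    rng = random.Random(seed)          # pseudo-random, repeatable sequence
    result = []
    for row, mask_row in zip(luma, foreground_mask):
        new_row = []
        for value, is_foreground in zip(row, mask_row):
            if is_foreground:
                # 'Paste' the foreground pixel back unchanged.
                new_row.append(value)
            else:
                jitter = rng.randint(-amplitude, amplitude)
                new_row.append(max(0, min(255, value + jitter)))
        result.append(new_row)
    return result
```

Calling the function with a different `seed` for each field would vary the background from field to field while each foreground object stays fixed.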

6) Reverse 3D Analysis and Correction

Reverse 3D occurs when the depth order of Objects
created by Parallax is perceived to be different to that
corresponding to the depth order in the real world. This
generally leads to viewer discomfort and should be corrected.
When converting monoscopic images to stereoscopic
image pairs Reverse 3D may be produced by:

i) Contra motion, objects moving left and right in the
same image.

ii) Objects and background moving in different directions.

iii) Many objects moving at varying speeds

Reverse 3D is corrected by analysing the nature of the
motion of the objects in an image and then manipulating
each Object individually using mesh distortion techniques
so that the Object Parallax matches with the expected visual
perception norms.
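The analysis step can be illustrated with a simple check for contra motion. The sketch assumes each tracked Object has a horizontal motion value in pixels per field and an area in pixels, takes the area-weighted direction as dominant, and flags Objects moving against it as candidates for individual correction. The function name, data layout and `dead_zone` threshold are hypothetical, and the mesh distortion itself is not shown.

```python
def find_contra_motion(objects, dead_zone=0.5):
    """Return names of Objects moving against the dominant lateral direction.

    objects   : dict mapping name -> (dx_px_per_field, area_px)
    dead_zone : motion smaller than this is treated as noise (assumed value)
    """
    weighted = sum(dx * area for dx, area in objects.values())
    if weighted == 0:
        return []                      # no dominant direction to compare with
    dominant = 1 if weighted > 0 else -1
    return sorted(
        name
        for name, (dx, _area) in objects.items()
        if abs(dx) > dead_zone and dx * dominant < 0
    )
```

For example, a large car moving right together with a small bird moving left yields `['bird']`, so only the bird would be passed on for correction.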

7) Miscellaneous Techniques 

By modifying the perspective of an object within an
image and by enhancing many of the minor depth cues the
stereoscopic effect can be emphasised. The techniques
below all operate using the 'Cut and Paste' technique. That
is, a foreground object is 'Cut', enhanced and then 'Pasted'
back on to the background.




a) Shadows — Shading gives an object perspective. 

b) Foreground/Background — By defocussing the
background, through blurring or fogging, a foreground
object may be emphasised, while by defocussing the
foreground object the background depth may be
emphasised.

c) Edge Enhancement — Edges help to differentiate an 
object from its background. 

d) Texture Mapping — Helps to differentiate the object 
from the background. 

Module 4 — 3D Media (Transmission & Storage) 

As for module 1, modules 4 and 5 are not essential to the 
present invention. Module 4 provides for the transmission 
and/or storage of the stereoscopic images. The transmission 
means can be adapted for a particular application. For 
example the following can be employed: 

1) Local Transmission — can be via coax cable 

2) Network TV Transmission — can be via 

i) Cable 

ii) Satellite 

iii) Terrestrial 

3) Digital Network— INTERNET, etc 

4) Stereoscopic (3D) Image Storage 

An image storage means may be used for storage of the 
image data for later transmission or display and may 
include: 

i) Analogue Storage — Video Tape, Film, etc 

ii) Digital Storage— Laser Disk, Hard Disk, CD-ROM, 
Magneto Optical Disk, DAT, Digital Video Cassette 
(DVC), DVD. 

Module 5— 3D Display 

As for the transmission means the display means can be 
dependent on the application requirements and can include: 

1) Set-top Box 

A set-top box by definition is a small box of electronics
that receives, decodes, provides accessory interfaces and
finally has outputs to suit the application. It may incorporate
the following:

a) Video or RF receiver. 

b) Stereoscopic (3D) decoder to provide separate left and 
right image outputs to Head Mounted Devices or other 
stereoscopic displays where separate video channels 
are required. 

c) Resolution Enhancement — Line Doubling/Pixel
Interpolation.

d) Shutter or Sequential Glasses Synchronisation. 

e) Stereoscopic depth sensitivity control circuitry. 

f) Accessories interface — remote control with features 
such as a 2D/3D switch and Depth control. 

g) Audio interface — audio output, headphone connection. 

h) Access channel decoding — cable and pay TV
applications.

i) Video or RF outputs. 

2) Stereoscopic Displays 

These use special glasses or head gear to provide separate
images to the left and right eyes, including:

a) Polarising glasses — Linear and Circular polarisers. 

b) Anaglyphic glasses — Coloured lenses — red/green, etc. 

c) LCD Shutter glasses. 

d) Colour Sequential Glasses. 
e) Head Mounted Devices (HMD) — Head gear fitted with
two miniature video monitors (one for each eye), VR
headsets.

3) Autostereoscopic Displays

a) Video Projector/Retroreflective screen based display
systems.

b) Volumetric display systems.

c) Lenticular lens based display systems.

d) Holographic Optical Element (HOE) based display
systems.

PREFERRED EMBODIMENT

In summary, the present invention provides in a preferred
embodiment a system that is capable of inputting monoscopic
image sequences in a digital format, or in an analogue
format in which case an analogue to digital conversion
process is involved. This image data is then subjected to a
method of image analysis whereby the monoscopic images
are compressed, if this is required for the particular
application.

By comparing blocks of pixels in an image, with corresponding
blocks in an adjacent image, and by obtaining the
minimum Mean Square Error for each block, motion within
the image can be determined.

Following motion detection, regions of an image are
identified for similar characteristics, such as, image
brightness, colour, motion, pattern and edge continuity. The
data is then subjected to motion analysis in order to determine
the nature of the motion in the image. This motion
analysis takes the form of determining the direction, speed,
type, depth and position of any motion in the image. This
motion is then categorised into a number of categories
including whether the motion is a complete scene change, a
simple pan, a complex pan, an object moving on a stationary
background, a stationary object in front of a moving
background, or whether there is no motion at all. Further
actions are then determined based on these categories to
convert the monoscopic images into stereoscopic image
pairs suitable for viewing on an appropriate stereoscopic
display device.

In the preferred embodiment, once the monoscopic
images are analysed, if a scene change or a complex pan is
detected then no further analysis of that particular scene is
required, rather the Field Delay and Field Delay history are
both reset to zero. An object detection process is then
applied to the new scene in order to try and identify objects
within that scene. Once these objects are identified, then
object processing takes place. If no objects are identified,
then the image is passed on for further processing using
forced parallax and 3D optimisation.

If the motion categorised during the image analysis is not
a scene change, then further analysis of that scene is
required. If further analysis of that scene results in the
motion being categorised as a simple pan, then it is necessary
to apply a Field Delay in accordance with the principles
of motion parallax. It is then passed on for further processing.
If the motion is not categorised as a simple pan, but
rather as an object in motion on a stationary background,
then again we have to apply a Field Delay in accordance
with the principles of motion parallax. In this regard, once
the motion parallax has been applied, it is necessary to
consider whether the objects all have a uniform direction. If
the objects do move in a uniform direction, then it is passed
on for further processing at a later stage. If the objects do not
have a uniform direction, then it is necessary to perform
further object processing on selected objects within that
scene to correct for the Reverse 3D effect. This can be
achieved through using mesh distortion and morphing
techniques.

If the motion is categorised as being a stationary object on
a moving background, it is then necessary to consider
whether the background has a large variation in depth. If it
does not, then we apply a Field Delay with the object having
priority using the principles of motion parallax. However, if
the background does have a large variation in depth, then we
apply a Field Delay with the background having priority as
opposed to the object, again using the principles of motion
parallax. In this case, it is then also necessary to perform
further object processing on the foreground object to correct
for the Reverse 3D effect prior to being passed on for further
processing.

If no motion is detected, then we next consider whether an
object in the scene was known from any previous motion. If
this is so, then we perform object processing on that selected
object. If not, then we apply an object detection process to
that particular scene in order to attempt to identify any
objects in it. If an object is identified, then we perform object
processing on that particular object; if not, Forced Parallax
and 3D Optimisation is performed.

Where object processing is required, objects are
identified, tagged and tracked, and then processed by using
techniques of mesh distortion and morphing, object
barrelling, edge enhancement, brightness modification and
object rotation.

In all cases, once the motion has been categorised and the
primary techniques to convert to stereoscopic images have
been applied, then a further amount of parallax or lateral
shifting called forced parallax is applied to the image. It is
noted that in the preferred embodiment, forced parallax is
applied to every image, not just for depth smoothing purposes
but to provide an underlying stereoscopic effect such that
all images are seen as having depth behind or in front of the
stereoscopic display device's reference plane, generally the
front of the monitor screen. The advantages of applying
forced parallax are that the system is better able to cope with
changes in the category of the motion detected without
causing sudden changes in the viewer's depth perception.

Once the forced parallax has been applied to the image,
the image is then passed for 3D Optimisation. Again, this is
not necessary in order to see a stereoscopic image, however
the optimisation does enhance the image's depth perception
by the viewer. The 3D Optimisation can take on a number of
forms including the addition of reference points or borders,
parallax modulation, parallax smoothing and parallax
adjustment for altering the depth sensitivity of any particular
viewer. The image can also be optimised by modifying
luminance or chrominance values pseudo randomly so that
background motion behind foreground objects can be
observed so that the depth perception is enhanced. It is also
possible to analyse for Reverse 3D so that a viewer's eyestrain
is minimised. Further techniques such as shadowing,
foreground and background fogging or blurring and edge
enhancement of the image can also be carried out in this
stage.

Once the image has been optimised it is then transmitted
to the appropriate display device. This transmission can take
a number of forms including cable, co-axial, satellite or any
other form of transmitting the signal from one point to
another. It is also possible that the image could be stored
prior to being sent to a display device. The display device
can take on a number of forms, and only need be appropriate




the application in hand, for example, it is possible to use
existing video monitors with a set top device in order to
separate the left and right images, increase the scan rate and
to synchronise viewing glasses. Alternatively, dedicated
stereoscopic displays can be used which incorporate the use
of glasses or head gear to provide the stereoscopic images or
alternatively, an auto-stereoscopic display device can be
used. It is envisaged that the present invention will have
application in theatres, cinemas, video arcades, cable or
network TV, in the education area, particularly in the
multimedia industry and in many other areas such as theme
parks and other entertainment applications.

The claims defining the invention are as follows:

1. A method for converting monoscopic images for viewing
in three dimensions comprising:
receiving monoscopic images;
analyzing said monoscopic images to determine
characteristics of the images and to determine if there is any
continuity between successive first and second of said
monoscopic images;
processing said monoscopic images based on the determined
image characteristics and/or the determination if there is any
continuity between the first and second monoscopic images;
and
outputting the processed images to suitable storage and/or
stereoscopic display systems,
wherein said analyzing further includes determining the
motion of said monoscopic images by:
dividing each image into a plurality of blocks, wherein
corresponding blocks on an adjacent image are offset
horizontally and/or vertically; and
comparing each said block with said corresponding blocks
to find a minimum mean square error and thereby the motion
of the block.

2. The method as claimed in claim 1, wherein said
processing includes at least one of the following methods:
motion parallax, forced parallax, parallax zones, image
rotation and/or object processing.

3. The method as claimed in claim 1, wherein said
monoscopic image is digitized before any said analyzing or
said processing is performed.

4. The method as claimed in claim 1, further comprising
compressing said monoscopic image prior to said analyzing.

5. The method as claimed in claim 1, further comprising
scaling said monoscopic image prior to said analyzing.

6. The method as claimed in claim 5, wherein said scaling
includes scaling said monoscopic image by a scaling factor
that depends on a digital video resolution of each image.

7. The method as claimed in claim 1, wherein said
analyzing includes analyzing successive first and second
images for continuity before determining the image
characteristics.

8. The method as claimed in claim 7, where said analyzing
successive first and second images includes comparing
median luminance values between the successive first and
second images to determine continuity.

9. The method as claimed in claim 8, wherein no continuity
is assumed when a difference in the median luminance
values exceeds 30.

10. The method as claimed in claim 7, wherein said
analyzing successive first and second images includes
comparing the top few lines of the successive images to assist
in determining continuity.

11. The method as claimed in claim 7, wherein where no
continuity is determined said processing includes introducing
a field delay to one eye such that the image that lacks
continuity is viewed by one eye of a viewer prior to being
viewed by the other eye of the viewer.

12. The method as claimed in claim 1, wherein said
analyzing includes identifying objects within the monoscopic
images to assist said processing.

13. The method as claimed in claim 12, wherein said
identifying includes comparing luminance values,
chrominance values, motion vectors and/or picture details of
adjacent pixels or groups of pixels.

14. The method as claimed in claim 1, wherein said
analyzing includes determining the motion of objects within
the monoscopic images to assist said processing.

15. The method as claimed in claim 14, wherein the
motion of the monoscopic images and/or said objects is
categorized into one of a predetermined range of motion
categories.

16. The method as claimed in claim 14, wherein a motion
vector is defined for each image based on a comparison of
the motion detected with predefined motion categories
ranging from no motion to a complete scene change.

17. The method as claimed in claim 1, wherein the motion
categories include scene change, simple pan, complex pan,
moving object, moving background, and no motion.

18. The method as claimed in claim 1, wherein any said
block without details is not compared with said
corresponding blocks.

19. The method as claimed in claim 7, wherein no
continuity is assumed when said comparing of the majority
of blocks with said corresponding blocks results in large
error values.

20. The method as claimed in claim 1, wherein said
processing of each image includes using motion parallax by
introducing a field delay such that one eye of a viewer views
the image before the other eye of the viewer.

21. The method as claimed in claim 20, wherein the
amount of motion is inversely proportional to the field delay.

22. The method as claimed in claim 20, further including
storing each field delay, and averaging the field delay for
each new image with previous field delays.

23. The method as claimed in claim 22, further including
deleting each stored field delay when a non-continuity is
detected.

24. The method as claimed in claim 1, wherein said
processing of each image includes using forced parallax by
introducing a lateral shift through displacement of left and
right eye images.

25. The method as claimed in claim 1, wherein said
processing of each image includes using parallax zones by
introducing a greater lateral shift to one portion of the image.

26. The method as claimed in claim 25, wherein a top
portion of the image is shifted laterally a greater amount than
a bottom portion of the image.

27. The method as claimed in claim 25, further including
applying a different parallax shift to a left side of the image
as opposed to a right side of the image.

28. The method as claimed in claim 1, wherein said
processing of each image includes using a combination of
forced parallax and motion parallax on various parts of the
image.

29. The method as claimed in claim 1, wherein said
processing of each image includes rotating left and right eye
images about the y axis an equal amount in an opposite
direction.

30. The method as claimed in claim 1, wherein said
processing of each image includes using at least one of the
following object processing techniques:
mesh distortion and morphing; object barrelling; object
edge enhancement; object brightness enhancement;
and/or object rotation.
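
By way of illustration only, the continuity test of claims 8 and 9 and the field delay averaging of claims 21 to 23 might be sketched as follows. The threshold of 30 is taken from claim 9. The function and class names, the proportionality constant k, the clamp max_delay and the handling of zero motion are assumptions made for this sketch and are not taken from the specification.

```python
from statistics import median

LUMA_DIFF_LIMIT = 30  # difference threshold stated in claim 9


def has_continuity(prev_luma, curr_luma, limit=LUMA_DIFF_LIMIT):
    """Return False when the median luminance of two successive
    images differs by more than `limit` (claims 8 and 9)."""
    return abs(median(curr_luma) - median(prev_luma)) <= limit


class FieldDelayAverager:
    """Running average of field delays (claims 21 to 23).

    `k` and `max_delay` are hypothetical tuning values.
    """

    def __init__(self, k=8.0, max_delay=4.0):
        self.k = k
        self.max_delay = max_delay
        self.stored = []

    def update(self, motion, continuous):
        # Claim 23: discard the stored delays on a non-continuity.
        if not continuous:
            self.stored.clear()
        # Claim 21: delay inversely proportional to the motion.
        # Clamping, and using the clamp for zero motion, are assumptions.
        if motion > 0:
            delay = min(self.max_delay, self.k / motion)
        else:
            delay = self.max_delay
        # Claim 22: store the delay and average it with earlier ones.
        self.stored.append(delay)
        return sum(self.stored) / len(self.stored)
```

In use, has_continuity would be evaluated once per pair of successive images and its result passed to update together with the measured motion, so that a scene change restarts the average rather than blending delays across unrelated scenes.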




31. The method as claimed in claim 1, further including
processing each processed image by applying a forced
parallax to the processed image.

32. The method as claimed in claim 31, wherein the
degree of forced parallax is determined by the amount of
parallax added during said processing of each image, such
that the total parallax added during said processing of each
processed image and the forced parallax, is substantially
equal to the total parallax of adjacent images.

33. The method as claimed in claim 31, wherein the
degree of the forced parallax is modulated between
predetermined minimum and maximum settings over a
predetermined time frame.

34. The method as claimed in claim 1, further including
optimizing to further enhance the processed images prior to
said outputting of the processed images to the stereoscopic
display and/or storage systems.

35. The method as claimed in claim 1, further including
adding a reference point to the processed image.

36. The method as claimed in claim 35, wherein the
reference point is at least one of:
a border around the perimeter of the processed image;
a plurality of concentric borders;
a partial border;
a logo; and/or
a picture.

37. The method as claimed in claim 1, wherein said
processing includes adding an amount of depth to the
monoscopic images, the amount being adjustable in
response to a viewer's preference.

38. The method as claimed in claim 1, wherein said
processing includes randomly moving the background of
each image in increments that are not consciously detectable
by a viewer.

39. The method as claimed in claim 1, further including
testing each image for reverse 3D and manipulating objects
individually in each image to compensate for any reverse 3D
found during said testing.

40. The method as claimed in claim 1, further including
cut and paste techniques to further emphasize a stereoscopic
effect.

41. An image conversion system for converting
monoscopic images for viewing in three dimensions including:
an input means adapted to receive monoscopic images;
a preliminary analysis means to determine if there is any
continuity between a first image and a second image in
a sequence of the monoscopic images;
a secondary analysis means for receiving the monoscopic
images which have a continuity, and analyzing the
monoscopic images to determine at least one of the
speed and direction of motion, or the depth, size and
position of objects, wherein analyzing of said
monoscopic images to determine the motion includes:
dividing each image into a plurality of blocks, wherein
corresponding blocks on an adjacent image are offset
horizontally and/or vertically, and
comparing each said block with said corresponding
blocks to find a minimum mean square error and
thereby the motion of the block;
a first processing means for processing the monoscopic
images based on data received from the preliminary
analysis means and/or the secondary analysis means.

42. The image conversion system as claimed in claim 41
further including a transmission means adapted to transfer
the processed images to a stereoscopic display system or a
storage system.

43. The image conversion system as claimed in claim 41,
wherein said first processing means processes the
monoscopic images by using at least one of motion parallax,
forced parallax, parallax zones, image rotation or object
processing.

44. The image conversion system as claimed in claim 41,
further including a second processing means adapted to
process the images received from said first processing
means.

45. The image conversion system as claimed in claim 44,
wherein said second processing means uses forced parallax
to process each image.

46. The image conversion system as claimed in claim 41,
further including a second processing means adapted to
optionally enhance the images prior to transmitting the
images to a stereoscopic display device.

47. The image conversion system as claimed in claim 46,
wherein said second processing means enhances the images
by using at least one of reference points, parallax
adjustment, parallax smoothing, parallax modulation,
movement synthesis, reverse 3D correction or cut and paste
techniques.

48. The system as claimed in claim 41, wherein said input
means is further adapted to digitize the monoscopic images.

49. The system as claimed in claim 41, further including
a compression means adapted to compress the monoscopic
images prior to analysis by said preliminary analysis means.

50. The system as claimed in claim 41, further including
a scaling means adapted to scale each monoscopic image
prior to analysis by said preliminary analysis means.

51. The system as claimed in claim 50, wherein the
scaling factor by which each monoscopic image is scaled
depends on a digital video resolution of each monoscopic
image.

52. The system as claimed in claim 41, wherein said
preliminary analysis means is adapted to determine objects
within said monoscopic images.

53. The system as claimed in claim 41, wherein said
preliminary analysis means is adapted to determine the
motion of the monoscopic images and/or the motion of
objects within the monoscopic images.

54. The system as claimed in claim 53, wherein said
preliminary analysis means is adapted to categorize the
motion into one of a predetermined range of motion
categories.

55. The system as claimed in claim 54, wherein the
motion categories include at least one of scene change,
simple pan, complex pan, moving object, moving
background, and no motion.

56. The system as claimed in claim 41, further including
means adapted to control the level of depth added to the
monoscopic images.

57. The system as claimed in claim 41, further including
means adapted to add a reference point to each processed
image.

58. The system as claimed in claim 41, further including
means for optimizing the processed image to further
improve a stereoscopic effect.

59. A system for converting monoscopic images for
viewing in three dimensions including:
a first module adapted to receive a monoscopic image;
a second module adapted to receive the monoscopic
image and analyze the monoscopic image to create
image data, wherein analyzing the monoscopic image
includes determining the motion of a plurality of
monoscopic images by:
dividing each monoscopic image into a plurality of
blocks, wherein corresponding blocks on an adjacent
image are offset horizontally and/or vertically, and
comparing each said block with said corresponding
blocks to find a minimum mean square error and
thereby the motion of the block;
a third module adapted to create stereoscopic image pairs 
from the monoscopic image using at least one prede- 
termined technique selected as a function of the image 
data; 

a fourth module adapted to transfer the stereoscopic 

image pairs; and 
a fifth module including a stereoscopic display means 

adapted to receive the stereoscopic pairs transferred by 

said fourth module. 
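
By way of illustration only, the block comparison recited in claim 59 (and in claims 1 and 41) might be sketched as an exhaustive search for the offset that gives the minimum mean square error. The images are assumed here to be lists of rows of luminance values. The function names, the block size and the search range are assumptions made for this sketch and are not taken from the specification.

```python
def block_mse(a, b, ay, ax, by, bx, size):
    """Mean square error between a size x size block of image `a` at
    (ay, ax) and a block of image `b` at (by, bx)."""
    total = 0
    for r in range(size):
        for c in range(size):
            d = a[ay + r][ax + c] - b[by + r][bx + c]
            total += d * d
    return total / (size * size)


def block_motion(curr, adjacent, top, left, size=4, search=2):
    """Compare one block of `curr` with blocks of `adjacent` that are
    offset horizontally and/or vertically by up to `search` pixels.

    Returns (dy, dx, error) for the offset with the minimum mean
    square error. `size` and `search` are hypothetical defaults.
    """
    height, width = len(adjacent), len(adjacent[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate blocks that fall outside the adjacent image.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            err = block_mse(curr, adjacent, top, left, y, x, size)
            if best is None or err < best[0]:
                best = (err, dy, dx)
    return best[1], best[2], best[0]
```

Calling block_motion once for each block of an image gives one motion estimate per block. A large minimum error for the majority of blocks would correspond to the non-continuity condition of claim 19.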

60. The system as claimed in claim 59, wherein said first 
module is further adapted to convert an analog monoscopic 
image into a digital monoscopic image. 

61. The system as claimed in claim 59, wherein said
second module is adapted to detect objects in a scene and to
determine the speed and direction of motion of the
detected objects.

62. The system as claimed in claim 59, wherein the 
monoscopic image is compressed prior to said second mod- 
ule analyzing the monoscopic image to create image data. 

63. The system as claimed in claim 59, wherein the third 
module further includes an optimization stage to further 
enhance the stereoscopic image pairs. 

64. The system as claimed in claim 59, wherein operation 
of said system is adapted to be suspended for later process- 
ing by any of said first through fifth modules. 

65. The method as claimed in claim 59, wherein the fourth
module is adapted to store the stereoscopic image pairs and
transfer the stereoscopic image pairs at a later time.